An Analysis of Usage Locality for Data-Centric Web Services (TR-2005-866)
Authors
Abstract
The growing popularity of XML Web Services is significantly increasing the proportion of Internet traffic that consists of requests to, and responses from, Web Services. Unfortunately, because web service responses are generated dynamically, traditional caching infrastructures consider them “uncacheable”. One remedy is to develop alternative caching infrastructures that improve performance through on-demand service replication, data offloading, and request redirection. These infrastructures benefit from two characteristics of web service traffic: (1) the open nature of the underlying protocols (SOAP, WSDL, and UDDI), which results in service requests and responses adhering to a well-formatted, widely known structure; and (2) the observation that, for a large number of currently deployed data-centric services, requests can be interpreted as structured accesses against a physical or virtual database. They require, however, sufficient locality in service usage to offset replication and redirection costs. This paper investigates whether such locality in fact exists in current web service workloads. We examine access logs from two large data-centric web service sites, SkyServer and TerraServer, to characterize workload locality across several dimensions: data space, network regions, and different time epochs. Our results show that both workloads exhibit a high degree of spatial and network locality: 10% of the client IP addresses in the SkyServer trace account for about 99.95% of the requests, and 99.94% of the requests in the TerraServer trace are directed towards regions that represent less than 10% of the overall data space accessible through the service. These results point to a substantial opportunity for improving Web Services scalability through on-demand service replication.
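The network-locality figure quoted in the abstract (the share of requests issued by the busiest 10% of client IP addresses) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name `top_client_share`, its `top_percent` parameter, and the assumption that a log has already been reduced to one client IP string per request are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def top_client_share(client_ips, top_percent=10):
    """Fraction of requests issued by the busiest `top_percent` of clients.

    `client_ips` holds one entry per logged request (hypothetical input
    format; the paper's actual log schema is not given in the abstract).
    """
    counts = Counter(client_ips)           # requests per distinct client
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Integer arithmetic avoids floating-point surprises; keep at least one client.
    k = max(1, len(counts) * top_percent // 100)
    top_requests = sum(n for _, n in counts.most_common(k))
    return top_requests / total
```

Passing region identifiers instead of IP addresses would give the analogous data-space measure, again only as an approximation of the kind of statistic the abstract reports.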
Similar resources
DataSlicer: A Hosting Platform For Data-Centric Network Services
As the Web evolves, the number of network services deployed on the Internet has been growing at a dramatic pace. Such services usually involve a massive volume of data stored in physical or virtual back-end databases, and access the data to dynamically generate responses for client requests. These characteristics restrict use of traditional mechanisms for improving service performance and scalab...
Custom Memory Allocation for Free Improving Data Locality with Container-Centric Memory Allocation
We propose a novel container-centric memory allocation scheme. In this scheme, the container’s semantics guide the memory allocation, which results in data locality improvement and execution time reduction. The container-centric allocation provides the benefits of custom memory allocation, with the portability advantage. Applications need not change a single line of code, but rather change the ...
Scalable Locality-Sensitive Hashing for Similarity Search in High-Dimensional, Large-Scale Multimedia Datasets
Similarity search is critical for many database applications, including the increasingly popular online services for Content-Based Multimedia Retrieval (CBMR). These services, which include image search engines, must handle an overwhelming volume of data, while keeping low response times. Thus, scalability is imperative for similarity search in Web-scale applications, but most existing methods a...
The Semantic Web - Semantics for Data and Services on the Web
the semantic web semantics for data and services on the the semantic web semantics for data and services on the the semantic web toc the semantic web semantics for data and services on the data-centric systems and applications data-centric systems and applications the semantic web semantics for data and services on the the semantic web toc beck-shop applying semantic web services to web-based i...
Automating the Generation of Joins in Large Databases and Web Services
In this data-centric world, as web services and service oriented architectures gain momentum and become a standard for data usage, there will be a need for tools to automate data retrieval. In this paper we propose a tool that automates the generation of joins in a transparent and integrated fashion in heterogeneous large databases as well as web services. This tool reads metadata information a...
Publication date: 2005